The pushback estimates consist of a preliminary data
modification followed by the application of a robust
statistic to the modified data. Application of configural
polysampling techniques and use of the minimum attainable
variance and maximum attainable polyefficiency derived from
these techniques aid in fine-tuning the pushback estimates.
The form of the pushback estimate shown by traditional
Monte Carlo methods to perform well in comparison to a good
biweight is modified and the performance is improved.


1. Introduction.

The techniques and uses of configural sampling and
configural polysampling were described in
( ~r.~ nr ^ t icon (lcn)l, (Pregibon and Tukey (1981)) and O-py .

To briefly review, the uses are:

(1) the determination of the minimum attainable variance
for each sampling situation,

(2) the determination of the maximum attainable
polyefficiency for several sampling situations and

(3) the tuning of a robust procedure with the aim of
increasing its efficiency or polyefficiency.

(1) can be achieved using configural sampling or
polysampling methods. In the former, no weights are used
since the data are all from the situation under
consideration. In the latter, weights (as described in
(Pregibon and Tukey (1981))) are used to take into account
the fact that we have data from situations other than that
for which we are determining the minimum variance. The
results discussed here are based on the configural
polysampling techniques (i.e. the weighted case).

*Prepared in connection with research at Princeton
University, supported by the Army Research Office (Durham).

These uses of configural polysampling are applied to the
pushback procedures. The pushback estimates are defined as
follows: Suppose we are given n observations,

y_1, y_2, ..., y_n,

from a particular situation {f_i: i = 1, ..., n} where the
f_i are location scale densities. The pushback procedure
modifies the order statistics of the n observations,

y(1), y(2), ..., y(n),

by subtracting some function of i, p(i),

y(1)-p(1), y(2)-p(2), ..., y(n)-p(n).

The form of p(i) considered is

p(i) = k*s*a(i)

where k is a constant, s is an estimate of the scale of the
{y(i)} and {a(i)} is a set of central values of order
statistics from a suitable unit distribution. Application
of a robust estimate to the pushed-back data determines the
pushback location estimate for the distribution of the
{y(i)}.
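
The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the report's own computation: Blom's approximation stands in for the central values a(i) of Gaussian order statistics, the scale s is taken to be the P-th percentile of the absolute deviations from the median (so that P = 50 gives the unscaled MAD), and the robust estimate applied to the pushed-back data is the median. The function names are hypothetical.

```python
from statistics import NormalDist, median


def gaussian_central_values(n):
    """Approximate central values a(i) of Gaussian order statistics.

    Assumption: Blom's approximation replaces whatever table of
    central values was actually used.
    """
    nd = NormalDist()
    return [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def percent_ad(y, p):
    """P%AD scale: p-th percentile of absolute deviations from the median.

    Assumption: linear interpolation between sorted deviations, so that
    p = 50 reproduces the (unscaled) median absolute deviation.
    """
    m = median(y)
    d = sorted(abs(v - m) for v in y)
    h = (len(d) - 1) * p / 100.0
    lo = int(h)
    hi = min(lo + 1, len(d) - 1)
    return d[lo] + (h - lo) * (d[hi] - d[lo])


def pushback_median(y, k=0.8, p=55.0):
    """Median of the pushed-back order statistics y(i) - k*s*a(i)."""
    ys = sorted(y)                      # order statistics y(1), ..., y(n)
    s = percent_ad(ys, p)               # scale estimate s
    a = gaussian_central_values(len(ys))
    pushed = [yi - k * s * ai for yi, ai in zip(ys, a)]
    return median(pushed)
```

For a sample that is symmetric about its center, such as `[1, 2, 3, 4, 5]`, the pushed-back values stay symmetric and the estimate is the center itself.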

2.  Minimum  attainable  variances. 

As seen in (Krystinik (1981b)), traditional Monte Carlo
results indicate that the pushback estimates of the form
P%AD-Gaus-pushback median perform well when maximin
efficiencies (with respect to the w6-biweight) are used as
the criterion of performance. We check this conclusion using
the minimum variance estimates for the two situations,
Gaussian and slash. Using 300 configurations, 150 from the
Gaussian and 150 from the slash, we obtain minimum variance
results as follows (see Krystinik (1981a) for the method of
calculation):

minimum variance

Gaussian  .  rf  o  p 

slash  .15  3 

Although standard error measurements for the variance
estimates are still in the rough formulation stage, we note
that these estimates should be fairly well determined since,
in using configural polysampling, we are effectively getting
information on these estimates from many more samples (than
configurations). The estimate for each configuration and
the variance estimate associated with it contain information
for the many samples (r and s varying; see (Tukey and
Pregibon (1981))) associated with the data configuration
{c(i)}.

Since the minimum variance for the Gaussian is known to
be .05 we will use this value and the slash minimum variance
value given above to calculate efficiencies for the
P%AD-Gaus-pushback median. These are shown in table 1 for a
range of P from 37.5 to 75. These efficiencies are
calculated using the traditional Monte Carlo variances.

From table 1, we see that the maximin efficiency is
approximately 7?% and is achieved at P=55 for k=0.8.

Table 1: Efficiencies* of the P%AD-Gaus-pushback median
         for the Gaussian and slash

                   P =  37.5    45     50**   55     70     75
            k
slash       .4          .755   .753   .773   .759   .715    ?
            .8          .775   .770   .777   .700   .54?   .410
           1.0          .7?2   .755   .752   .747   .415   .344
           1.2          .735   .715   .670   .650   .373    ?

Gaussian    .4           ?      ?     .509   .693   .715   .723
            .8          .697   .711   .742   .758   .890   .933
           1.0          .712   .745   .804   .826   .947   .956
           1.2          .733   .795   .370   .291   .945   .521

*with respect to the configural polysampling based
minimum variance for the slash and .05 for the
Gaussian minimum variance

**50%AD values are those of the MAD.
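
As a minimal sketch of how a maximin choice is read off a table of this kind: each efficiency is the minimum attainable variance divided by the variance of the estimate, the worst efficiency over the situations is recorded for each (P, k) choice, and the choice with the largest worst-case efficiency is kept. The variances below are made up for illustration only (they are not values from the report), and the function names are hypothetical.

```python
def efficiency(min_variance, variance):
    """Efficiency relative to the minimum attainable variance."""
    return min_variance / variance


def maximin_choice(variances, min_variances):
    """Return (label, worst-case efficiency) with the largest worst case.

    variances:     {label: {situation: variance of the estimate}}
    min_variances: {situation: minimum attainable variance}
    """
    best_label, best_eff = None, -1.0
    for label, by_situation in variances.items():
        worst = min(
            efficiency(min_variances[sit], var)
            for sit, var in by_situation.items()
        )
        if worst > best_eff:
            best_label, best_eff = label, worst
    return best_label, best_eff


# Hypothetical numbers, invented for this example.
min_variances = {"gaussian": 0.05, "slash": 0.15}
variances = {
    ("P=50", "k=1.0"): {"gaussian": 0.0625, "slash": 0.20},
    ("P=55", "k=0.8"): {"gaussian": 0.0650, "slash": 0.19},
}
label, worst_eff = maximin_choice(variances, min_variances)
```

With these invented inputs the second choice wins, because its weaker situation (the Gaussian) is still more efficient than the weaker situation of the first choice (the slash).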
Thus using an estimate of the actual minimum attainable
variance for the slash for sample size 20 (rather than the
best known, for which the w6-biweight was a close
approximation, or an asymptotic lower bound) and the two
situations (Gaussian and slash) which are likely to cover
the remaining 3 (OWG, mix and slacu), we obtain conclusions
which support those obtained using the w6-biweight variances
and the five situations. We limit further discussion to the
55%AD-Gaus-pushback median form.


3. Maximum attainable biefficiency.


Following the computations discussed in (Krystinik
(1981a)), we obtain the biefficiency for the two situations,
i.e.

    max_t  min_{Gaussian, slash}  [ minvar / var(t) ].

The biefficiency for sample size 20 is 9?%. The bioptimal
curve corresponding to different shadow prices (see
(Krystinik and Morgenthaler (1981))) for the two situations
is shown in Figure 1. This optimal efficiency can be used
to see how far from the optimum possible value a specific
robust procedure is. For example, the pushback
(55%AD-Gaus-pushback median) biefficiency is ... The
pushback is doing reasonably well but some fine-tuning to
increase its efficiency would be desirable.


4.  Fine-tuning  the  pushback. 

The third use of configural polysampling, i.e.
fine-tuning robust estimates, here the pushback, is done as
follows. Using t=55%AD-Gaus-pushback median, we calculate
t ... noted. We determine whether some slight modification
of the pushback procedure will bring the pushback estimate
closer to the biefficient estimate in these configurations.
The data sample shown in Figure 2 (as a sample, not on the
configuration scale; the configuration is just a rescaling
and translation of the values) is an example of a
configuration for which the pushback estimate and the
biefficient estimate are quite different. Note also that
the w6-biweight is between the two. Figure 2 shows the
original sample with order statistics labelled A-T. The
pushback data are shown for k=.8, .9, 1.0, 1.1, and 1.2 on
the five lines at the bottom of the figure. Straight lines
connect the original order statistics to the associated
pushback values. The bioptimal estimate, the
55%AD-Gaus-pushback median and the w6-biweight are each
marked on the figure.

Figure 2: Original and pushback data for a sample where the
pushback estimate is quite different from the biefficient
estimate. The biefficient estimate and w6-biweight are for
the ...

A modification which eliminates m ≤ 20 observations
(where m depends on the configuration) far from the center
of the data and then uses the set {a(i), i=1, ..., 20-m},
the central order-statistic values for a Gaussian sample of
size 20-m, is suggested. This modification tends to keep
central Gaussian-like points and uses a set of central
order-statistic values adapted to the new sample size. The
form of this modification that has been shown to perform
well (see Andrews et al (1972)) is the set of skipping
procedures. Skipping at 1.0 (1.5, 2.0) is defined as
follows:

(1) calculate the hinges and the hingespread,

(2) eliminate observations further out than 1.0 (1.5, 2.0)
times the hingespread from either hinge.
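
The two skipping steps can be sketched as follows. This is an illustrative reading, not the report's code: the hinges are assumed to follow Tukey's depth rule (hinge depth = (floor(median depth) + 1) / 2, counted from each end of the sorted sample), and the names `hinges` and `skip` are hypothetical.

```python
import math


def hinges(values):
    """Lower and upper hinges of a non-empty sample (Tukey's depth rule, assumed)."""
    x = sorted(values)
    n = len(x)
    depth = (math.floor((n + 1) / 2) + 1) / 2   # hinge depth from each end
    lo, hi = math.floor(depth), math.ceil(depth)
    lower = (x[lo - 1] + x[hi - 1]) / 2
    upper = (x[n - lo] + x[n - hi]) / 2
    return lower, upper


def skip(values, c=2.0):
    """Drop observations further than c hingespreads outside either hinge."""
    lower, upper = hinges(values)
    spread = upper - lower                      # the hingespread
    return [
        v for v in sorted(values)
        if lower - c * spread <= v <= upper + c * spread
    ]


data = list(range(1, 20)) + [100]               # 19 regular points, one wild value
kept = skip(data, c=1.0)
m = len(data) - len(kept)                       # number of observations eliminated
```

The count `m` is what the modified procedure would use to pick the central order-statistic values for a sample of size n - m before pushing back the retained observations.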

The skipping procedures were tested with skipping at
1.0, 1.5, and 2.0. Figure 3 shows the application of
skipping at 2.0 to the data shown in Figure 2. The skipped
pushback estimate has moved closer to the biefficient
estimate and is closer to the biefficient estimate than the
w6-biweight for pushback constants k=.8, .9, 1.0, and 1.1.

The overall performance of the skipped procedures needs
to be evaluated. The skipping modification may improve the
performance of the pushback for the configurations on which
the pushback and biefficient estimates are far apart, but at
the same time make the pushback estimates worse on the other
configurations. Table 2 shows the efficiencies (w.r.t. the
minimum variance in each situation) of the skipped
55%AD-Gaus-pushback median. These efficiencies are
calculated by obtaining variance estimates for the skipped
55%AD-Gaus-pushback median. Skipped 55%AD-Gaus-pushback
medians are calculated for the same 150/150 configurations
used in obtaining the minimum variance estimates. We then
use the relation

    var_c{t(y)|c} = var_c{t_o(y)|c} + E{s^2|c} (t(c) - t_o(c))^2

where t_o(c) is the minimum variance estimate for the
configuration and {t_o(y)|c} is the rescaled and translated
version, t_o(y) = r_obs + s_obs * t_o(c). Combining the
configuration level information E{s^2|c} with the optimal
estimate values and the skipped pushback estimate value, we
obtain var_c(t) for a given configuration. We then use the
weights described in (Pregibon and Tukey (1981)) to obtain
an estimate of the unconditional MSE. As seen in table 2
the biefficiency has increased from 73% to 87.5% due to the
configural polysampling guided modification of the pushback.
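
A small numerical sketch of this two-stage calculation follows. It assumes the relation has the form "conditional variance of t = conditional variance of the minimum variance estimate + E{s^2|c} times the squared difference of the two estimates on the configuration scale", and it assumes the polysampling weights enter as a self-normalized weighted average. The weights themselves are defined in Pregibon and Tukey and are simply taken as given inputs here. All names and numbers are hypothetical.

```python
def conditional_variance(var_opt, es2, t_c, t_opt_c):
    """Conditional variance of an estimate t for one configuration c.

    var_opt : conditional variance of the minimum variance estimate
    es2     : E{s^2 | c}, the configuration level scale information
    t_c     : value of t on the configuration
    t_opt_c : value of the minimum variance estimate on the configuration
    """
    return var_opt + es2 * (t_c - t_opt_c) ** 2


def unconditional_mse(conditional_values, weights):
    """Weighted average over configurations (weights assumed given)."""
    total = sum(weights)
    return sum(w * v for w, v in zip(weights, conditional_values)) / total


# Invented inputs for two configurations.
v1 = conditional_variance(var_opt=0.04, es2=0.5, t_c=0.3, t_opt_c=0.1)
v2 = conditional_variance(var_opt=0.10, es2=0.5, t_c=0.2, t_opt_c=0.2)
mse = unconditional_mse([v1, v2], weights=[1.0, 3.0])
```

The second configuration shows the case where the estimate agrees with the minimum variance estimate, so nothing is added to the optimal conditional variance.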

5. Bioptimal curves and possible further modifications of
the pushback.

Figure 4 shows the bioptimal curve and the skipped
pushback curves for fixed skipping constants and those for
fixed pushback constant. It also shows the bioptimal
one-step biweight. For sample size 20, S. Morgenthaler
(personal communication, 1981) has shown the best one-step
biweight to be the w6.75-biweight. It has a biefficiency of
87%. Thus simple estimates in the form of skipped pushbacks
perform very well in comparison to the maximum attainable
biefficiency and the w6.75-biweight.

Figure 4: Bioptimal and skipped pushback curves.
(Horizontal axis: slash efficiency. Plotting symbols
distinguish skip 2.0, skip 1.5 and skip 1.0.)

What does this picture (figure 4) suggest for better
choices of estimates aimed at achieving 1) higher
biefficiency, 2) high slash efficiency with 90% Gaussian
efficiency, and 3) high slash efficiency with 95% Gaussian
efficiency? The curve for a specified skipping factor
roughly moves down on the efficiency/efficiency plot as the
skipping factor increases. Thus one suggestion for
increasing biefficiency is a pushback with skipping factor
slightly larger than 2.0.

This suggestion may also be useful in achieving higher
slash efficiency for 90% or 95% Gaussian efficiency. The
slope on the right side of the fixed skipping factor curve
increases with increasing skipping factor. Thus we would
expect intersection with the 90% or 95% Gaussian efficiency
horizontal line at a higher slash efficiency. The gains
from increasing the skipping constant are not expected to be
as large as those from the proposals below.

A second suggestion for increasing biefficiency and
slash efficiency for 90% or 95% Gaussian efficiency is the
set of estimates of the form

θ skipped + (1-θ) unskipped.

The pushback constants chosen for the skipped and unskipped
versions used in the linear combination will depend on which
of the aims 1)-3) is considered. Figure 4 indicates that
for aims 2) and 3) larger pushback constants should be used
than for aim 1).

Preliminary results on the performance of estimates of
the form

θ skipped + (1-θ) unskipped

are given in figure 5. Figure 5 shows the skip at 2.0
pushback, the no skip pushback and the linear combination
pushback efficiencies. For the linear combination pushback,
the skipped pushback constant used is 1.2 and the unskipped
pushback constant is 1.0. These results indicate that
estimates of the linear combination form are likely to be a
good choice for aims 1)-3).
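
To illustrate how the weight in such a linear combination might be examined, the sketch below uses the standard identity for the variance of a weighted sum of two estimates, var = θ²·var(skipped) + (1-θ)²·var(unskipped) + 2θ(1-θ)·cov, and scans θ on a grid for the largest worst-case efficiency over the two situations. The variances, covariances and minimum variances are invented for the example, and the grid search is an assumed procedure, not the method used to produce figure 5.

```python
def combination_variance(theta, var_skipped, var_unskipped, cov):
    """Variance of theta*skipped + (1 - theta)*unskipped."""
    return (theta ** 2 * var_skipped
            + (1.0 - theta) ** 2 * var_unskipped
            + 2.0 * theta * (1.0 - theta) * cov)


def worst_efficiency(theta, moments, min_variances):
    """Smallest efficiency over the situations for a given theta."""
    return min(
        min_variances[sit] / combination_variance(theta, *m)
        for sit, m in moments.items()
    )


def best_theta(moments, min_variances, steps=100):
    """Grid search over theta in [0, 1] for the best worst-case efficiency."""
    best = (None, -1.0)
    for j in range(steps + 1):
        theta = j / steps
        worst = worst_efficiency(theta, moments, min_variances)
        if worst > best[1]:
            best = (theta, worst)
    return best


# Hypothetical (var_skipped, var_unskipped, cov) per situation.
moments = {
    "gaussian": (0.060, 0.055, 0.054),
    "slash": (0.17, 0.22, 0.16),
}
min_variances = {"gaussian": 0.05, "slash": 0.15}
theta, worst_eff = best_theta(moments, min_variances)
```

Because θ = 0 and θ = 1 are both on the grid, the selected combination is never worse, in the worst-case sense, than either the unskipped or the skipped estimate alone.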